Operator Language: A Program Generation Framework for Fast Kernels

Authors

  • Franz Franchetti
  • Frédéric de Mesmay
  • Daniel S. McFarlin
  • Markus Püschel
Abstract

We present the Operator Language (OL), a framework to automatically generate fast numerical kernels. OL provides the structure to extend the program generation system Spiral beyond the transform domain. Using OL, we show how to automatically generate library functionality for the fast Fourier transform and multiple non-transform kernels, including matrix-matrix multiplication, synthetic aperture radar (SAR), circular convolution, sorting networks, and Viterbi decoding. The control flow of the kernels is data-independent, which allows us to cast their algorithms as operator expressions. Using rewriting systems, a structural architecture model and empirical search, we automatically generate very fast C implementations for state-of-the-art multicore CPUs that rival hand-tuned implementations.
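As a rough illustration of what data-independent control flow means, the sketch below builds a small sorting network, one of the kernel classes named in the abstract. It is not OL notation and not code generated by Spiral. The function names `odd_even_network` and `apply_network` are hypothetical, and odd-even transposition is chosen here only because it is short. The point is that the sequence of compare-exchange steps depends on the input size alone, never on the input values, so the whole computation can be written down as a fixed expression before any data is seen.

```python
def odd_even_network(n):
    """Return a fixed comparator schedule for n inputs.

    The schedule is an odd-even transposition network: n rounds,
    alternating between even and odd neighbor pairs. It depends
    only on n, not on the data to be sorted.
    """
    return [(i, i + 1)
            for rnd in range(n)
            for i in range(rnd % 2, n - 1, 2)]


def apply_network(network, values):
    """Apply each compare-exchange step in order and return a sorted copy."""
    out = list(values)
    for i, j in network:
        # Same two positions are touched regardless of the values;
        # min/max stands in for a branch-free compare-exchange.
        lo, hi = min(out[i], out[j]), max(out[i], out[j])
        out[i], out[j] = lo, hi
    return out


if __name__ == "__main__":
    network = odd_even_network(4)
    print(network)
    print(apply_network(network, [3, 1, 4, 2]))
```

Because the schedule is known ahead of time, a generator could in principle unroll, reorder, or vectorize the steps without inspecting the data, which is the property the abstract relies on when it casts algorithms as operator expressions.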

Similar resources

Compiling Stream Kernels for Polymorphous Computing Architectures

Polymorphous Computing Architectures (PCA) have multiple modes of operation and can reassign resources allocated to these modes during program execution. Such architectures enable a single computational fabric to meet the diverse computing needs of complex applications that previously required multiple, distinct HW/SW solutions integrated into a system solution. The MONARCH chip is a PCA capabl...

FFT Program Generation for the Cell BE

The complexity of the Cell BE’s architecture makes it difficult and time consuming to develop multithreaded, vectorized, high-performance numerical libraries. Our approach to solving this problem is to use Spiral, a program generation system, to automatically generate and optimize linear transform libraries for the Cell. To extend the Spiral framework to support the Cell architecture, we first ...

Automatic Generation of Sparse Tensor Kernels with Workspaces

Recent advances in compiler theory describe how to compile sparse tensor algebra. Prior work, however, does not describe how to generate efficient code that takes advantage of temporary workspaces. These are often used to hand-optimize important kernels such as sparse matrix multiplication and the matricized tensor times Khatri-Rao product. Without this capability, compilers and code generators...

Adaptive Dynamic Scheduling of FFT on Hierarchical Memory and Multi-Core Architectures

In this dissertation, we present a framework for expressing, evaluating and executing dynamic schedules for FFT computation on hierarchical and shared memory multiprocessor / multi-core architectures. The framework employs a two layered optimization methodology to adapt the FFT computation to a given architecture and dataset. At installation time, the code generator adapts to the microprocessor...

Multi Objective Scheduling of Utility-scale Energy Storages and Demand Response Programs Portfolio for Grid Integration of Wind Power

Increasing the penetration of variable wind generation in power systems has created some new challenges in the power system operation. In such a situation, the inclusion of flexible resources which have the potential of facilitating wind power integration is necessary. Demand response (DR) programs and emerging utility-scale energy storages (ESs) are known as two powerful flexible tools that ca...

Publication date: 2009